Core-hours ≈ Wall-clock time × Number of CPU cores
Example
A program that runs for 2 hours on 8 cores uses 2 × 8 = 16 core-hours
# Example output from /usr/bin/time -v
Elapsed (wall clock) time: 0:02:15
User time (seconds): 120.45
System time (seconds): 12.33
Maximum resident set size (kbytes): 2048000
Understanding this relationship helps optimize resource requests
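The formula can be captured in a small helper (a sketch; the function name is our own):

```python
def core_hours(wall_hours: float, cores: int) -> float:
    """Core-hours = wall-clock time (in hours) x number of CPU cores."""
    return wall_hours * cores

# The example above: 2 hours on 8 cores
print(core_hours(2, 8))  # 16.0
```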
Scaling Patterns: linear vs. quadratic
# Simple scaling test
./my_program --data 2percent_sample.csv # Measure time
./my_program --data 4percent_sample.csv # Should take ~2x if linear
./my_program --data 8percent_sample.csv # Should take ~4x if linear
If 8% data takes 10× longer than 2% data, you don’t have linear scaling!
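One quick way to quantify a scaling test like the one above is to fit an exponent k to two (size, time) pairs: if time grows as size**k, then k = log(t2/t1) / log(n2/n1). A sketch (the helper name is ours):

```python
import math

def scaling_exponent(n1: float, t1: float, n2: float, t2: float) -> float:
    """Estimate k in time ~ size**k from two measurements.
    k near 1 means linear scaling; k near 2 means quadratic."""
    return math.log(t2 / t1) / math.log(n2 / n1)

# 8% sample takes 10x as long as the 2% sample (4x the data):
print(round(scaling_exponent(2, 1.0, 8, 10.0), 2))  # 1.66 -> worse than linear
```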
Amdahl’s Law
#!/bin/bash
dataset="my_data.csv"
for cores in 1 2 4 8 16; do
    echo "Testing with $cores cores..."
    /usr/bin/time -f "cores=$cores wall=%E cpu=%U+%S" \
        my_parallel_program --input "$dataset" --threads $cores \
        2>> scaling_results.log
done
Example Results
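The loop above measures actual speedup; Amdahl's Law predicts the ceiling: if only a fraction p of the runtime can be parallelised, the speedup on n cores is capped at 1 / ((1 − p) + p/n). A minimal sketch (the 0.95 parallel fraction is illustrative):

```python
def amdahl_speedup(p: float, n: int) -> float:
    """Amdahl's Law: ideal speedup on n cores when a fraction p
    (0 <= p <= 1) of the runtime is parallelisable."""
    return 1.0 / ((1.0 - p) + p / n)

# Even 95%-parallel code falls well short of 16x on 16 cores:
print(round(amdahl_speedup(0.95, 16), 2))  # 9.14
```

This is why the measured speedup in a table of results typically flattens as the core count grows.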
Tip
Document environment (software versions, modules)
Run end-to-end tests (include all steps)
# Basic timing with detailed output
/usr/bin/time -v my_program input.dat
# Custom format for logging
/usr/bin/time -f 'wall=%E user=%U sys=%S maxRSS=%M' my_program input.dat
Elapsed: Wall-clock time
User/System time: CPU time breakdown
Maximum resident set size: Peak memory (KB)
Benchmarking my code with different data sizes
Scenario: Image processing pipeline
Goal: Process 10,000 images, estimate resources needed
# Create subsets of your full dataset
mkdir test_data_5pct test_data_10pct test_data_15pct
# Randomly sample (adjust numbers for your case)
shuf -n 500 full_image_list.txt > test_data_5pct/image_list.txt
shuf -n 1000 full_image_list.txt > test_data_10pct/image_list.txt
shuf -n 1500 full_image_list.txt > test_data_15pct/image_list.txt
Tip
Use random sampling to ensure representative subsets
#!/bin/bash
module load languages/python/3.7.12
for size in 5 10 15; do
    echo "=== Testing ${size}% dataset ==="
    for run in {1..3}; do
        echo "Run $run..."
        /usr/bin/time -f "${size}pct run$run: wall=%E user=%U sys=%S maxRSS_KB=%M" \
            python image_pipeline.py \
            --input test_data_${size}pct/ \
            --output results_${size}pct_run${run}/ \
            2>> timing_results.log
    done
done
# Extract timing data
grep "wall=" timing_results.log
# Results:
# 5pct run1: wall=0:02:15 user=2:01.45 sys=0:12.33 maxRSS_KB=2048000
# 5pct run2: wall=0:02:18 user=2:03.12 sys=0:13.01 maxRSS_KB=2051000
# ...
# 15pct run1: wall=0:07:10 user=6:27.22 sys=0:41.11 maxRSS_KB=2055000
Analysis:
5% data: ~135 seconds
10% data: ~285 seconds
15% data: ~430 seconds
Roughly linear scaling confirmed!
100% estimate: ~2,850 seconds ≈ 47.5 minutes
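Parsing and extrapolating can be scripted. Below is a sketch that converts /usr/bin/time %E strings to seconds and scales a sample timing up linearly (helper names are ours; a linear fit is only valid once linear scaling has been confirmed, and estimates from different sample sizes will differ slightly):

```python
def wall_to_seconds(wall: str) -> float:
    """Convert a /usr/bin/time %E value ([h:]mm:ss[.cc]) to seconds."""
    seconds = 0.0
    for part in wall.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds

def linear_estimate(sample_pct: float, sample_seconds: float) -> float:
    """Extrapolate a linearly scaling runtime to the full (100%) dataset."""
    return sample_seconds * 100.0 / sample_pct

t5 = wall_to_seconds("0:02:15")   # 5% sample -> 135.0 s
print(linear_estimate(5, t5))     # 2700.0 s estimated from the 5% sample
```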
Important
Always benchmark on compute nodes, not login nodes!
#!/bin/bash
#SBATCH --account=ABC012345
#SBATCH --partition=test
#SBATCH --job-name=benchmark
#SBATCH --output=benchmark_out_%j.txt
#SBATCH --error=benchmark_err_%j.txt
#SBATCH --nodes=1
#SBATCH --cpus-per-task=8
#SBATCH --mem=16G
#SBATCH --time=00:30:00
module load languages/python/3.7.12
export OMP_NUM_THREADS=8 # Match requested CPUs
echo "Starting benchmark at $(date)"
echo "Running on node: $(hostname)"
/usr/bin/time -v srun --cpu-bind=cores \
    python my_program.py --input test_10pct.csv --threads 8
# Get job statistics
sacct -j JOBID --format=JobID,Elapsed,TotalCPU,MaxRSS,ReqCPUS,State -P
# Example output:
# JobID|Elapsed|TotalCPU|MaxRSS|ReqCPUS|State
# 12345|00:15:30|01:58:45|2048000K|8|COMPLETED
Resource Analysis:
Wall time: 15.5 minutes
CPU time: 118.75 minutes
Peak memory: ~2GB
Core-hours: (15.5 / 60) h × 8 cores ≈ 2.07 core-hours
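The same arithmetic can be applied directly to the pipe-delimited sacct record (a sketch; the field order matches the --format string used above):

```python
def core_hours_from_sacct(record: str) -> float:
    """Core-hours from one `sacct -P` record with fields
    JobID|Elapsed|TotalCPU|MaxRSS|ReqCPUS|State."""
    _, elapsed, _, _, req_cpus, _ = record.split("|")
    h, m, s = (int(x) for x in elapsed.split(":"))
    wall_hours = h + m / 60 + s / 3600
    return wall_hours * int(req_cpus)

row = "12345|00:15:30|01:58:45|2048000K|8|COMPLETED"
print(round(core_hours_from_sacct(row), 2))  # 2.07
```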
Scenario
Note
Step-by-step calculation:
Full problem: 5× larger (20% → 100%)
Time per run: 45 min × 5 = 225 min = 3.75 hours
Core-hours per run: 3.75 × 8 = 30 core-hours
Base cost: 30 × 50 runs = 1,500 core-hours
Safety factors:
Total request: 1,500 × 3.6 = 5,400 core-hours
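The step-by-step calculation above, written out (all figures come from the scenario; the ×3.6 combined safety margin is the one used above):

```python
runs = 50
bench_minutes = 45        # one run on the 20% dataset, 8 cores
scale = 100 / 20          # full problem is 5x larger
cores = 8
safety = 3.6              # combined safety margin

hours_per_run = bench_minutes * scale / 60    # 3.75 h
core_hours_per_run = hours_per_run * cores    # 30.0 core-hours
base = core_hours_per_run * runs              # 1500.0 core-hours
total = base * safety                         # 5400.0 core-hours
print(base, total)  # 1500.0 5400.0
```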
For queries related to these training materials: jgi-training@bristol.ac.uk
For other HPC queries: hpc-help@bristol.ac.uk
Happy computing! 🚀